Abstract



We focus on designing combinatorial algorithms for the Capacitated Network Design problem (Cap-SNDP).
Cap-SNDP is the problem of satisfying connectivity requirements when edges have costs and
hard capacities. We begin by showing that Group Steiner Tree is a special case of Cap-SNDP even when
there is a connectivity requirement between only one source-sink pair. This implies the first poly-logarithmic
lower bound for Cap-SNDP.

We next provide combinatorial algorithms for several special cases of this problem. Cap-SNDP is
equivalent to its special case where every edge has either zero cost or infinite capacity. We consider a special
case, called Connected Cap-SNDP, where all infinite-capacity edges in the solution are required to form a
connected component containing the sinks. This problem is motivated by its similarity to the Connected
Facility Location problem [20, 31]. We solve this problem by reducing it to Submodular Tree Cover, which
is a common generalization of Connected Cap-SNDP and Group Steiner Tree. We generalize the recursive
greedy algorithm of [10], obtaining a poly-logarithmic approximation algorithm for Submodular Tree Cover.
This result is interesting in its own right and gives the first poly-logarithmic approximation algorithms for
Connected hard capacities set multi-cover and Connected source location.

We then study another special case of Cap-SNDP called Unbalanced-P2P. Besides its practical applications
to shift design problems [13], it generalizes many problems such as k-MST, Steiner Forest and
Point-to-Point Connection. We give a combinatorial logarithmic approximation algorithm for this problem
by reducing it to degree-bounded SNDP.
1 Introduction

1.1 Capacitated Survivable Network Design 

The main topic of this paper is the following fundamental network design problem. 

Capacitated Survivable Network Design (Cap-SNDP) 

Instance: An undirected graph $G = (V, E)$ with edge-capacities $\{u_e \mid e \in E\}$ and edge-costs $\{c_e \mid e \in E\}$,
and requirements $\{r_{ij} \mid i, j \in V\}$ ($c_e$, $u_e$, and $r_{ij}$ are all non-negative integers).
Objective: Find a minimum-cost spanning subgraph $H$ of $G$ such that for every $i, j \in V$, the capacity of
any $ij$-cut in $H$ is at least $r_{ij}$ (equivalently, $H$ contains $r_{ij}$ $ij$-paths such that every edge $e$ belongs to at
most $u_e$ paths).

The special case with a single source and a single sink is called fixed cost flow [15]. When all edge-capacities
are unit, Cap-SNDP reduces to the Survivable Network Design problem (SNDP) [18, 23], which generalizes the Steiner
forest problem [2] when the connectivity requirements are in $\{0, 1\}$, and the Steiner tree problem when
connectivity requirements are in $\{0, 1\}$ and all sinks are identical. Unlike these classical special cases, however,
the approximability of Cap-SNDP is not well understood: not even a logarithmic hardness is known, and at the
same time no approximation algorithm with ratio better than $O(|E|)$ is known, even for very restricted settings.

Cap-SNDP also generalizes the following buy-at-bulk-type network design problem. We are given an undirected
graph $G = (V, E)$ where each edge $e \in E$ is associated with a non-decreasing cost function $f_e$ that specifies
the cost $f_e(u)$ of installing $u$ units of capacity on edge $e$. The instance also gives connectivity requirements
$\{r_{ij} \mid i, j \in V\}$. The problem is to decide the capacity $u_e$ to be installed on each edge $e$ so that the capacity of any
$ij$-cut is at least $r_{ij}$ and the total cost $\sum_{e \in E} f_e(u_e)$ is minimized. Note that both the recently studied versions,
namely one with economies of scale in which the functions $f_e$ are concave [11, 30] and one with dis-economies
of scale in which the functions $f_e$ are non-concave, are special cases of this problem. Assuming that all the
involved numbers are polynomially bounded integers, we can reduce this problem to Cap-SNDP by replacing each
edge $e$ with parallel edges $e_1, e_2, \ldots, e_R$ where edge $e_k$ has capacity $u_{e_k} = 1$ and cost $c_{e_k} = f_e(k) - f_e(k-1)$.
Here $R = \max_{i,j} r_{ij}$. If the numbers are not polynomially bounded, we can use standard scaling techniques to
get a polynomial reduction, while losing a constant factor in the approximation guarantee.
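The parallel-edge replacement can be illustrated with a minimal sketch, assuming polynomially bounded integer data. The function name expand_edge and the (capacity, cost) tuple representation are illustrative assumptions and do not come from the paper.

```python
def expand_edge(f_e, R):
    """Replace one edge with cost function f_e by R parallel unit-capacity edges.

    f_e: callable returning the cost f_e(u) of installing u units of capacity.
    R:   the maximum requirement (max over all pairs of r_ij).
    Returns a list of (capacity, cost) pairs for the parallel edges e_1, ..., e_R,
    where e_k has capacity 1 and cost f_e(k) - f_e(k - 1).
    """
    return [(1, f_e(k) - f_e(k - 1)) for k in range(1, R + 1)]
```

By construction, the costs of the first k parallel edges sum to f_e(k) - f_e(0).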

It is easy to see that Cap-SNDP is equivalent to its special case where each edge has either infinite (or sufficiently
large) capacity or zero cost. Such a reduction is done by replacing each edge $e$ of capacity $u_e$ and cost
$c_e$ by a path of length two with two edges $e_1$ and $e_2$ where $u_{e_1} = \infty$, $c_{e_1} = c_e$ and $u_{e_2} = u_e$, $c_{e_2} = 0$. We
call an edge with infinite capacity a cost-edge and an edge with zero cost a capacity-edge. From now on, we
assume that our Cap-SNDP instances satisfy this property.
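A minimal sketch of this edge-splitting step follows. The function name split_edge, the tuple layout (endpoint, endpoint, capacity, cost), and the caller-supplied fresh node name mid are illustrative assumptions.

```python
import math

def split_edge(u, v, capacity, cost, mid):
    """Replace edge {u, v} by the two-edge path u - mid - v.

    Returns (cost_edge, capacity_edge), each as (endpoint, endpoint, capacity, cost).
    `mid` is a fresh node name chosen by the caller (hypothetical helper argument).
    """
    cost_edge = (u, mid, math.inf, cost)   # infinite capacity, original cost
    capacity_edge = (mid, v, capacity, 0)  # original capacity, zero cost
    return cost_edge, capacity_edge
```

The path carries exactly the original capacity (the minimum of the two capacities) at exactly the original cost (the sum of the two costs).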

1.2 Connected Single-Sink Capacitated Survivable Network Design 

We next consider the following special case of Cap-SNDP. This special case is motivated by its similarity to the
Connected Facility Location problem, where the open facilities are required to be connected by a backbone
network [20, 31].

Connected Single-Sink Capacitated Survivable Network Design (Connected Cap-SNDP) 

Instance: An undirected graph $G = (V, E_u \cup E_c)$ where $E_u$ is the set of capacity-edges (with integer
capacities $u_e$ and zero cost) and $E_c$ is the set of cost-edges (with integer costs $c_e$ and infinite capacity), a
sink $t \in V$ and sources $s_1, \ldots, s_k \in V$ with their integer requirements $r_1, \ldots, r_k$.

Objective: Find a minimum-cost subgraph $H$ of $G$ such that the minimum $s_i t$-cut in $H$ has capacity at
least $r_i$ (for $i \in [k]$) and $H \cap E_c$ forms a connected (backbone) graph containing $t$.

Our first main result is as follows. Let n denote the number of nodes in G. 
Theorem 1.1 Even the single-source version of Connected Cap-SNDP is $\Omega(\log^{2-\epsilon} n)$-hard to approximate
for any $\epsilon > 0$, unless NP has a Las Vegas quasi-polynomial-time algorithm. Thus Cap-SNDP is also
$\Omega(\log^{2-\epsilon} n)$-hard to approximate for any $\epsilon > 0$, unless NP has a Las Vegas quasi-polynomial-time
algorithm. Here $n = |V|$ denotes the number of vertices.

Theorem 1.2 There exists a polynomial-time combinatorial $O(\log^{2+\epsilon} n \cdot \log \sum_{i=1}^{k} r_i)$-approximation
algorithm for Connected Cap-SNDP for any $\epsilon > 0$.

We prove Theorem 1.1 by giving an approximation-ratio-preserving reduction from the well-known Group
Steiner Tree problem to the single-source version of Connected Cap-SNDP. Recall that the Group Steiner Tree
problem is defined as follows. The instance is given by an undirected graph $G = (V, E)$ with edge-costs $c_e$
and subsets (groups) of nodes $g_1, \ldots, g_k \subseteq V$, and the objective is to find a minimum-cost subtree $H$ of $G$ that
contains at least one node from every group. Halperin and Krauthgamer [22] state the lower bound in the form
of $\Omega(\log^{2-\epsilon} k)$ where $k$ is the number of groups. However, the size of their construction is $O(k)$. In particular,
the number of nodes they have in the tree is less than $2k$. We present the lower bound in terms of the number
$n$ of nodes in their tree, which is between $k$ and $2k$. Since the lower bound is poly-logarithmic, $k$ and $n$ are
essentially the same for our purposes.

1.3 Submodular Tree Cover

We prove Theorem 1.2 by presenting an algorithm for a generalization of Connected Cap-SNDP
called Submodular Tree Cover. We define Submodular Tree Cover below and show that it in fact generalizes
several other interesting problems. Let $U$ be a ground set. A set-function $f : 2^U \to \mathbb{Z}$ is called
non-decreasing if $f(A) \le f(B)$ for all $A \subseteq B \subseteq U$, and is called submodular if
$f(A) + f(B) \ge f(A \cap B) + f(A \cup B)$ for all $A, B \subseteq U$.

Submodular Tree Cover 

Instance: An undirected graph $G = (V, E)$ with edge-costs $\{c_e \mid e \in E\}$ and a non-decreasing submodular
function $f : 2^V \to \mathbb{Z}$. The function $f$ is given by a value oracle that returns $f(S)$ when given $S \subseteq V$.
Objective: Find a minimum-cost subtree $T = (V_T, E_T)$ of $G$ such that $f(V_T) = f(V)$.

We show the following algorithmic result for Submodular Tree Cover. Let $n$ denote the number of nodes in $G$
and $F_{\max} = f(V) = \max_{S \subseteq V} f(S)$ be the maximum value of $f$. For the purpose of this paper, we assume that
$F_{\max}$ is polynomially bounded in $n$.

Theorem 1.3 There exists a polynomial-time combinatorial $O(\log^{2+\epsilon} n \cdot \log F_{\max})$-approximation
algorithm for Submodular Tree Cover, for any $\epsilon > 0$.

We now argue that Submodular tree cover generalizes the following interesting problems. 
• Connected Cap-SNDP. 

Given an instance $(G = (V, E_u \cup E_c), t \in V, s_1, \ldots, s_k \in V, r_1, \ldots, r_k > 0)$ of Connected Cap-SNDP, we
construct an instance of Submodular Tree Cover as follows. Let $G_c = (V, E_c)$ be the graph on the cost-edges
with costs $\{c_e \mid e \in E_c\}$ inherited from the Connected Cap-SNDP instance. Similarly let $G_u = (V, E_u)$ be the
graph on the capacity-edges with capacities $\{u_e \mid e \in E_u\}$ inherited from the Connected Cap-SNDP instance.
Given $S \subseteq V$ and $i \in [k]$, let $u(\delta(S, s_i))$ denote the capacity of the minimum-capacity cut in $G_u$ that
separates $s_i$ from all vertices in $S$. Now define a set-function $f : 2^V \to \mathbb{Z}$ as follows. For $S \subseteq V$, let

$$f(S) = \sum_{i=1}^{k} \min\{r_i, u(\delta(S, s_i))\}.$$

It is easy to see that $f$ is non-decreasing. To show that $f$ is submodular, it is enough to argue that $u(\delta(S, s_i))$
is submodular for any $i$. Now for any two sets $S_j \subseteq V$ for $j = 1, 2$, let $C_j \supseteq S_j$ with $s_i \notin C_j$ be the minimum
capacity cuts that separate $s_i$ from $S_j$. Note that $u(\delta(C_1)) + u(\delta(C_2)) \ge u(\delta(C_1 \cap C_2)) + u(\delta(C_1 \cup C_2))$.
Since $C_1 \cap C_2$ (resp., $C_1 \cup C_2$) separates $s_i$ from $S_1 \cap S_2$ (resp., $S_1 \cup S_2$), the claim follows. Note also
that there is a polynomial-time value oracle based on minimum cut computations. Theorem 1.3 implies
Theorem 1.2.
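The value oracle based on minimum cuts can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the paper: the names max_flow and cut_value, the edge-list input format, the auxiliary node name "_sink", and the choice of Edmonds-Karp as the max-flow routine are all assumptions. The minimum cut separating a source from a set S is computed by attaching every node of S to an auxiliary sink with a capacity larger than the total edge capacity.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max flow; cap[u][v] is the capacity of arc (u, v)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res[v].setdefault(u, 0)  # reverse residual arc
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            res[parent[v]][v] -= bottleneck
            res[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

def cut_value(nodes, cap_edges, sources, reqs, S):
    """f(S) = sum over i of min(r_i, min cut separating s_i from S in the capacity graph).

    cap_edges: list of undirected capacity-edges (a, b, capacity).
    """
    S = set(S)
    if not S:
        return 0  # nothing to separate from, so every cut has capacity 0
    big = 1 + sum(c for _, _, c in cap_edges)  # stands in for infinite capacity
    total = 0
    for s_i, r_i in zip(sources, reqs):
        if s_i in S:
            total += r_i  # no finite cut separates s_i from itself
            continue
        cap = {v: {} for v in nodes}
        cap["_sink"] = {}
        for a, b, c in cap_edges:  # an undirected edge becomes two arcs
            cap[a][b] = cap[a].get(b, 0) + c
            cap[b][a] = cap[b].get(a, 0) + c
        for v in S:
            cap[v]["_sink"] = big
        total += min(r_i, max_flow(cap, s_i, "_sink"))
    return total
```

For example, on the path a - b - c with capacities 3 and 1 and a single source a with requirement 2, the set {c} has value 1 while {b} has value 2.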

The source location problem studied by Bar-Ilan et al. can be thought of as a special case of Connected
Cap-SNDP. Our result gives the first non-trivial approximation algorithm for this problem in the setting of
general graph connectivity cost functions.

• Group Steiner Tree with group-demands and node-capacities.

The instance of this problem is the same as for Group Steiner Tree except that in addition, there is a demand
$d_i$ for every group $g_i$, $i \in [k]$, and a capacity $b_v$ for every node $v \in V$. The objective is to compute a
minimum-cost subtree $T = (V_T, E_T)$ of $G$ and assign each node $v \in V_T$ to one or more groups such that

• if $v \in V_T$ is assigned to a group $g_i$, $i \in [k]$, then $v \in g_i$,

• each node $v \in V_T$ is assigned to at most $b_v$ groups, and

• at least $d_i$ nodes are assigned to each group $g_i$, $i \in [k]$.

The fact that this problem is a special case of Submodular Tree Cover can be shown with the function
$f : 2^V \to \mathbb{Z}$ defined below. Fix a subset $S \subseteq V$ and construct a flow network $N$ as follows. The vertices in $N$
are source, sink, $z_v$ for $v \in S$ and $y_i$ for groups $i \in [k]$. The directed arcs in $N$ are (source, $z_v$) with
capacity $b_v$ for $v \in S$, ($z_v$, $y_i$) with capacity 1 for $i \in [k]$ and $v \in g_i \cap S$, and ($y_i$, sink) with capacity $d_i$
for $i \in [k]$. Define $f(S)$ to be the maximum flow that can be routed from source to sink in $N$. It is easy
to see that $f$ is a non-decreasing and submodular function. Theorem 1.3 implies an $O(\log^{2+\epsilon} n \cdot \log \sum_{i=1}^{k} d_i)$-approximation
algorithm for this problem.
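The flow network N can be built and evaluated with a short sketch. The names max_flow and assignment_value, the tuple-labelled network nodes, and the use of a simple depth-first Ford-Fulkerson routine are illustrative assumptions; any max-flow algorithm would serve.

```python
def max_flow(cap, s, t):
    """Ford-Fulkerson with DFS augmenting paths; cap[(u, v)] is an integer capacity."""
    res = dict(cap)
    adj = {}
    for (u, v) in cap:
        res.setdefault((v, u), 0)  # reverse residual arc
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def augment(u, limit, seen):
        if u == t:
            return limit
        seen.add(u)
        for v in adj.get(u, ()):
            if v not in seen and res[(u, v)] > 0:
                pushed = augment(v, min(limit, res[(u, v)]), seen)
                if pushed:
                    res[(u, v)] -= pushed
                    res[(v, u)] += pushed
                    return pushed
        return 0

    flow = 0
    while True:
        pushed = augment(s, float("inf"), set())
        if not pushed:
            return flow
        flow += pushed

def assignment_value(S, groups, demand, node_cap):
    """f(S): max flow in the network source -> z_v -> y_i -> sink described above.

    groups: list of node sets g_1..g_k; demand: list d_1..d_k; node_cap: dict v -> b_v.
    """
    cap = {}
    for v in S:
        cap[("source", ("z", v))] = node_cap[v]
        for i, g in enumerate(groups):
            if v in g:
                cap[(("z", v), ("y", i))] = 1
    for i, d in enumerate(demand):
        cap[(("y", i), "sink")] = d
    return max_flow(cap, "source", "sink")
```

With groups {a, b} and {b, c}, demands 1 and 2, and unit node capacities, the full node set attains the value 3, which equals the total demand.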

The special case with $b_v = \infty$ for all nodes $v$ is called the Covering Steiner Tree problem and is known to admit
an $O(\log^3 n)$-approximation ratio [25, 13, 21]. However, we present the first poly-logarithmic approximation
algorithm for the general node-capacities case. The Group Steiner Tree problem with group-demands and
node-capacities also generalizes other covering problems where the cover is required to be connected in
some graph, like the Connected Dominating Set problem or the Connected Set Cover with hard capacities problem.



1.4 Unbalanced Point to Point Connection 



We next define a very important special case of Cap-SNDP called the Unbalanced Point-to-Point Connection
(Unbalanced-P2P) problem. This problem is motivated by a so-called shift scheduling problem with several
practical applications to workforce scheduling. A solution to the shift design problem has been included in a
product called OPA from Ximes GmbH. See [13, 29] for more details.

Unbalanced Point to Point Connection (Unbalanced-P2P) 

Instance: An undirected graph $G = (V, E)$ with edge-costs $\{c_e \mid e \in E\}$ and integer charges $\{b_v : v \in V\}$.
Objective: Find a minimum-cost subgraph $H$ of $G$ such that $b(H') := \sum_{v \in H'} b_v \ge 0$ for every connected
component $H'$ of $H$.

It is easy to see that the problem has a feasible solution if, and only if, $G$ is a feasible solution, i.e., every
connected component $C$ of $G$ satisfies $b(C) \ge 0$, and that any inclusion-minimal solution is a forest. Given
an instance of Unbalanced-P2P, let $V^+ = \{v \in V \mid b_v > 0\}$ and let $V^- = \{v \in V \mid b_v < 0\}$. The fact that
Unbalanced-P2P is a special case of Cap-SNDP, even in the case of a single demand, can be seen as follows. Given
an instance of Unbalanced-P2P, create a graph $G'$ by adding to $G$ two new nodes $s$ and $t$ and edges $(s, v)$ for all
$v \in V^-$ and $(v, t)$ for $v \in V^+$. The original edges in $G$ inherit their cost $c_e$ and get infinite capacity. The new
edges $(s, v)$ for $v \in V^-$ get capacity $|b_v|$ and zero cost and the new edges $(v, t)$ for $v \in V^+$ get capacity $b_v$ and
zero cost. The nodes $s, t$ form the source-sink pair with connectivity requirement $|\sum_{v \in V^-} b_v|$.
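The construction of the single-pair Cap-SNDP instance can be sketched as below. The function name p2p_to_cap_sndp, the default node names "s" and "t" (assumed not to clash with existing nodes), and the (u, v, capacity, cost) tuple layout are illustrative assumptions.

```python
import math

def p2p_to_cap_sndp(edges, charge, s="s", t="t"):
    """Build the single-pair Cap-SNDP instance from an Unbalanced-P2P instance.

    edges:  list of (u, v, cost) for the original graph.
    charge: dict mapping node -> integer charge b_v.
    Returns (new_edges, requirement); each new edge is (u, v, capacity, cost).
    """
    # Original edges keep their cost and get infinite capacity.
    new_edges = [(u, v, math.inf, c) for u, v, c in edges]
    requirement = 0
    for v, b in charge.items():
        if b < 0:
            new_edges.append((s, v, -b, 0))  # capacity |b_v|, zero cost
            requirement += -b
        elif b > 0:
            new_edges.append((v, t, b, 0))   # capacity b_v, zero cost
    return new_edges, requirement
```

The returned requirement is the absolute value of the total negative charge, so the pair (s, t) can be satisfied only if every negatively charged node can ship its charge to positively charged nodes it is connected to.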
Our result for Unbalanced-P2P is as follows.



Theorem 1.4 There exists a polynomial-time combinatorial 2-approximation algorithm for the special
case of Unbalanced-P2P with $b(V) := \sum_{v \in V} b_v = 0$. Furthermore, if the charges $\{b_v : v \in V\}$ are
polynomially bounded in $|V|$, Unbalanced-P2P admits an exact algorithm on tree instances (i.e., $G$ is a
tree) and ratio $O(\log \min\{n', 2 + b(V)\})$ on general graphs, where $n' = |V^+ \cup V^-|$ is the number of
nodes with non-zero charge.

Apart from being very important in practice, Unbalanced-P2P generalizes the following important problems.

• Point-to-Point Connection [19]. This problem is exactly the special case of Unbalanced-P2P with
$b_v \in \{-1, 0, 1\}$ for all $v \in V$ and $|V^+| = |V^-|$, i.e., $\sum_v b_v = 0$. Goemans and Williamson [19] give a
$(2 - \frac{1}{|V^+|})$-approximation algorithm for this problem.

• k-Steiner Tree [12]. The instance of this problem is given by an undirected graph $G = (V, E)$ with edge-costs
$\{c_e \mid e \in E\}$, a subset $U \subseteq V$ of terminals and an integer $k \le |U|$, and the goal is to find a minimum-cost
tree in $G$ that contains at least $k$ terminals. The case $U = V$ is the k-MST problem [16]. The k-Steiner Tree
problem reduces to Unbalanced-P2P as follows: "guess" a terminal $s$ that belongs to some optimal solution
and set $b_s = -(k - 1)$, $b_t = 1$ for all $t \in U \setminus \{s\}$, and $b_v = 0$ otherwise.

• Steiner Forest problem [19]. The instance of this problem is given by an undirected graph $G = (V, E)$
with edge-costs $\{c_e \mid e \in E\}$ and $k$ pairs of terminals $s_1 t_1, \ldots, s_k t_k$, and the goal is to find a minimum-cost
subgraph of $G$ that connects $s_i$ to $t_i$ for all $i \in [k]$. Without loss of generality, we can assume that
these pairs of terminals do not share a node. This problem reduces to Unbalanced-P2P as follows: for
$i \in [k]$, set $b_{s_i} = 2^i$, $b_{t_i} = -2^i$, and $b_v = 0$ otherwise. We argue that any feasible solution to this instance
of Unbalanced-P2P connects $s_i$ to $t_i$ for all $i \in [k]$ and vice versa. Since $\sum_{v \in V} b_v = 0$, each connected
component in a feasible solution must have total charge zero. Thus the total positive charge (written out in
the binary representation) must equal the absolute value of the total negative charge (written out in the binary
representation) in any connected component. Thus any connected component contains $s_i$ if and only if it
contains $t_i$, for any $i \in [k]$.
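The charge assignments used in the last two reductions can be written down directly. The sketch below is illustrative only: the function names and the dictionary representation of charges are assumptions, and pairs_stay_together is a brute-force check (exponential in the number of terminals) of the binary-representation argument, meant for tiny examples.

```python
from itertools import combinations

def k_steiner_charges(nodes, terminals, k, s):
    """Charges for the k-Steiner tree reduction with guessed terminal s."""
    b = {v: 0 for v in nodes}
    for t in terminals:
        b[t] = 1
    b[s] = -(k - 1)  # the guessed terminal must attract k - 1 other terminals
    return b

def steiner_forest_charges(nodes, pairs):
    """Charges b_{s_i} = 2^i and b_{t_i} = -2^i for terminal pairs (s_i, t_i), i = 1..k."""
    b = {v: 0 for v in nodes}
    for i, (s_i, t_i) in enumerate(pairs, start=1):
        b[s_i] = 2 ** i
        b[t_i] = -(2 ** i)
    return b

def pairs_stay_together(pairs):
    """Check that every zero-charge subset of terminals contains s_i iff it contains t_i."""
    nodes = [v for pair in pairs for v in pair]
    b = steiner_forest_charges(nodes, pairs)
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            if sum(b[v] for v in subset) == 0:
                chosen = set(subset)
                if any((s in chosen) != (t in chosen) for s, t in pairs):
                    return False
    return True
```

Because distinct powers of two have unique subset sums, the check succeeds for any list of disjoint pairs.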

1.5 Previous work 

Cap-SNDP is one of the most fundamental problems in combinatorial optimization. Even the Fixed-Cost Flow
problem (the case of a single source and single sink) already includes several fundamental problems. Krumke et
al. [28] proved a logarithmic hardness of the directed version, and gave a $k$-approximation algorithm, where $k$ is
the requirement of the single pair. The special case of directed Cap-SNDP, namely directed Fixed-Cost Flow, was
shown to be Label-Cover hard by Even et al. [13] in 2002, which implies the same lower bound for directed Cap-SNDP.
Eight years later, the same hardness was rediscovered independently by Chekuri et al. [8].

Goemans et al. [19] were the first to consider approximation algorithms for Cap-SNDP with multiple pairs.
However, they mainly consider "soft capacities", where multiple copies of an edge are allowed. Carr et al.
observed that the natural cut-based LP-relaxation has an unbounded integrality gap even for the unicast case.
Motivated by this observation, they strengthened the basic cut-based LP by adding so-called Knapsack-Cover
inequalities. Using these inequalities, they obtained constant-factor approximation algorithms for some special
graph topologies. However, in the general case, the integrality gap of the basic cut-based LP enhanced by 
Knapsack-Cover inequalities is G(n 2 ). Very recently, Chekuri et al. [8] considered various special cases of 
Cap-SNDP. For soft capacities, they give an $O(\log k)$ upper bound, where $k$ is the number of pairs with positive
requirement, and an $O(\log n)$ approximation ratio for the case when the requirements $r_{ij}$ are equal for all $i, j \in V$.
They also show an $\Omega(\log \log n)$ hardness result for the case of soft capacities. They gave no hardness result
for the hard capacity case, as in Cap-SNDP. Approximation ratios or hardness results for the soft capacities case
do not extend to Cap-SNDP.
A related problem that also generalizes the Survivable Network problem (but without capacities) is the
Node-Weighted Survivable Network problem [26, 24]; in this problem the costs/weights are on the nodes, every
edge has capacity 1 and cost 0, and the cost of a subgraph is the sum of the costs of its nodes. The best known
ratio for this problem is $O(r_{\max} \log n)$ [26], where $r_{\max} = \max_{i,j \in V} r_{ij}$ is the maximum requirement.

Garg, Konjevod, and Ravi [17] present an $O(\log N \cdot \log k)$-approximation algorithm for Group Steiner Tree
on trees, where $k$ is the number of groups and $N$ is the maximum size of a group. Zosin and Khuller [34] give
an alternative primal-dual approximation algorithm that solves an exponential linear program, and has ratio
$O(\log^2 n)$. The first combinatorial polylogarithmic algorithm is by Chekuri et al. [10], who used the recursive
greedy technique (see [9, 27, 33]) to obtain ratio $O(\log^{2+\epsilon} n)$. All the above upper bounds are close to the
best possible, as Halperin and Krauthgamer [22] give a lower bound of $\Omega(\log^{2-\epsilon} n)$ for any fixed $\epsilon$, unless NP
has a quasi-polynomial-time Las Vegas algorithm.

Finally we list the best known approximation ratios for the other important special cases of Cap-SNDP. The
best known ratio for k-MST is 2 [16] and for k-Steiner Tree is 4 [12] (one way to get ratio 4 is to apply
metric completion and move to the graph induced by terminals, losing a factor of 2, and then use the
2-approximation algorithm [16] for k-MST on the graph induced by the terminals). The best approximation factor
for Steiner Tree is roughly 1.39. For Steiner Forest, Point-to-Point Connection, and Survivable Network, the
best known ratio is 2, see [2, 19, 23], respectively.

Even et al. [13] obtain an $O(\log |\sum_{v \in V^-} b_v|)$-approximation algorithm for Unbalanced-P2P. Our
$O(\log(2 + |\sum_{v \in V} b_v|))$-approximation result in Theorem 1.4 is incomparable.



1.6 Organization 

We begin by proving Theorem 1.1 in Section 2. Our recursive-greedy algorithm for Submodular Tree Cover is
given in Section 3 and algorithms for Unbalanced-P2P in Section 4.



2 Hardness of Connected Cap-SNDP (Theorem 1.1) 



Given an instance $(G = (V, E), \{c_e \ge 0 \mid e \in E\}, r, \{S_1, \ldots, S_k\})$ of Group Steiner Tree, we construct
an instance of Connected Cap-SNDP as follows (see Figure 1 for an illustration). For a positive integer $k$, let
$[k] = \{1, \ldots, k\}$. Construct a graph $G^+ = (V^+, E^+)$ from $G$ by adding some new nodes and edges as follows.
Let $V^+ = V \cup \{s\} \cup \{g_i \mid i \in [k]\}$ and $E^+ = E \cup F$ where
$F = \{\{s, v\} \mid v \in \bigcup_{i \in [k]} S_i\} \cup \{\{v, g_i\} \mid v \in S_i, i \in [k]\} \cup \{\{g_i, r\} \mid i \in [k]\}$.
Each edge $e \in E$ is assigned cost $c_e$ and capacity $u_e = \infty$. Each edge $e = \{s, v\}$ for
$v \in \bigcup_i S_i$ is assigned cost $c_e = 0$ and capacity $u_e = |\{i \mid v \in S_i, i \in [k]\}|$, i.e., the number of groups $v$ belongs to.
Each edge $e = \{v, g_i\}$ for $v \in S_i$, $i \in [k]$ is assigned cost $c_e = 0$ and capacity $u_e = 1$. Each edge $e = \{g_i, r\}$
for $i \in [k]$ is assigned cost $c_e = 0$ and capacity $u_e = |S_i| - 1$, i.e., one less than the number of nodes in group
$S_i$. Finally we set the sink as $t = r$ and the demand as $d = \sum_{i \in [k]} |S_i| = \sum_{v \in V} |\{i \mid v \in S_i, i \in [k]\}|$.
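The construction can be sketched in a few lines. The function name gst_to_connected_cap_sndp, the source name "s" (assumed not to clash with an existing node), the group-node labels ("g", i), and the (u, v, cost, capacity) tuple layout are illustrative assumptions.

```python
import math

def gst_to_connected_cap_sndp(edges, root, groups):
    """Build the Connected Cap-SNDP instance from a Group Steiner Tree instance.

    edges:  list of (u, v, cost) for the original graph.
    root:   the root r, which becomes the sink t.
    groups: list of node sets S_1..S_k.
    Returns (new_edges, source, sink, demand); each new edge is (u, v, cost, capacity).
    """
    # Original edges keep their cost and get infinite capacity.
    new_edges = [(u, v, c, math.inf) for u, v, c in edges]
    membership = {}  # node -> number of groups containing it
    for S in groups:
        for v in S:
            membership[v] = membership.get(v, 0) + 1
    for v, count in membership.items():
        new_edges.append(("s", v, 0, count))
    for i, S in enumerate(groups, start=1):
        g = ("g", i)
        for v in S:
            new_edges.append((v, g, 0, 1))
        new_edges.append((g, root, 0, len(S) - 1))
    demand = sum(len(S) for S in groups)
    return new_edges, "s", root, demand
```

Note that the total capacity leaving the source equals the demand d, so every edge {s, v} must be saturated in any feasible flow.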

Now we show the following one-to-one correspondence between the feasible solutions of the original Group
Steiner Tree instance and those of the created Connected Cap-SNDP instance.

Lemma 2.1 There exists a solution for the Group Steiner Tree instance with cost at most $C$ if, and only if, there exists a
solution for the Connected Cap-SNDP instance with cost at most $C$. Furthermore, the solution to Group Steiner Tree
can be computed in polynomial time from that to the Connected Cap-SNDP instance, and vice versa.

Let subtree $T = (V_T, E_T)$ be a solution of cost $C$ to the Group Steiner Tree instance. Let $H = E_T \cup F$ be a
subgraph of $G^+$. Since all edges in $F$ have cost 0, the cost of $H$ is also $C$. We now argue that $H$ forms a feasible
solution to the Connected Cap-SNDP instance, i.e., a flow of $d$ units can be routed from $s$ to $t$ in $H$. We start by
routing a flow of $u_{\{s,v\}} = |\{i \mid v \in S_i, i \in [k]\}|$ units from $s$ to each $v \in \bigcup_i S_i$. Consider a node $v \in V_T \cap (\bigcup_i S_i)$.
Such a node forwards its entire flow to $t = r$ along the unique path from it to $r$ in the tree $T$. This flow can
be supported since the edges in $T$ have infinite capacity. Now consider a node $v \in (\bigcup_i S_i) \setminus V_T$. Such a node
forwards 1 unit of its received flow to each $g_i$ for which $v \in S_i$ along the unit-capacity edge $\{v, g_i\}$. Note that
any node $g_i$ receives at most $|S_i| - 1$ units of flow from all the nodes $v \in S_i$. This is because at most $|S_i| - 1$
nodes in $S_i$ do not belong to $T$, which in turn holds because $T$ contains at least one node from $S_i$. Lastly each
node $g_i$ forwards its received flow to $t = r$ along the edge $\{g_i, r\}$ of capacity $|S_i| - 1$. Thus indeed $H$ forms a
feasible solution to the Connected Cap-SNDP instance.

Now let $H$ be a solution of cost $C$ to the Connected Cap-SNDP instance. Since all edges in $F$ have
zero cost, we can assume that $F \subseteq H$, without loss of generality. It is enough to prove that $H \cap E$
contains a path from some node in $S_i$ to $r$ for each $i \in [k]$. Suppose this is not true for some group $S_j$
for $j \in [k]$. We extract an $s$-$t$-cut in graph $H$ with capacity strictly less than $d$, contradicting the existence
of a flow of value $d$ from $s$ to $t$ in $H$. Let $U' \subseteq V$ denote the set of nodes connected to some node in $S_j$
in $H \cap E$ and let $U = \{s, g_j\} \cup U'$. Note that $s \in U$ while from our assumption $t \notin U$. We now prove the
following claim.

Claim 2.2 The total capacity of edges in $H$ that leave $U$ is strictly less than $d$.

Proof: It is easy to note that all the edges in $H$ that leave $U$ are (1) $\{g_j, r\}$ with capacity $|S_j| - 1$, (2)
$\{v, g_i\}$ with capacity 1, for all $i \ne j$ and $v \in S_i \cap U'$, and (3) $\{s, v\}$ with capacity $|\{i \mid v \in S_i, i \in [k]\}|$ for
all $v \in V \setminus U'$. Thus the total capacity of these edges is

\begin{align*}
& |S_j| - 1 + \sum_{i \ne j} \sum_{v \in S_i \cap U'} 1 + \sum_{v \in V \setminus U'} |\{i \mid v \in S_i, i \in [k]\}| \\
&= |S_j| - 1 + \sum_{v \in U'} |\{i \mid v \in S_i, i \in [k], i \ne j\}| + \sum_{v \in V \setminus U'} |\{i \mid v \in S_i, i \in [k]\}| \\
&= -1 + \sum_{v \in U'} |\{i \mid v \in S_i, i \in [k]\}| + \sum_{v \in V \setminus U'} |\{i \mid v \in S_i, i \in [k]\}| \\
&= -1 + \sum_{v \in V} |\{i \mid v \in S_i, i \in [k]\}| \\
&= d - 1.
\end{align*}

The third line uses $S_j \subseteq U'$. The claim follows.

Figure 1: The instance of Connected Cap-SNDP created in the reduction from Group Steiner Tree. The labels on the
edges denote (cost, capacity). Not all labels are shown in the figure.

The above claim implies Lemma 2.1 and thus the proof of Theorem 1.1 is complete.
3 Approximating Submodular Tree Cover (Theorem 1.3)



3.1 Preliminaries and notations 

Given a set $U$, a function $f : 2^U \to \mathbb{Z}$, a subset $S \subseteq U$ and an element $x \in U$, denote
$f_S(x) = f(S \cup \{x\}) - f(S)$ and $f_S(T) = f(S \cup T) - f(S)$. We say that $f$ obeys the improvement independence axiom
if for every pair of subsets $S, T \subseteq U$ such that $S \subseteq T$, $\sum_{u \in T \setminus S} f_S(u) \ge f(T) - f(S)$. We recall the
following equivalence from [32]: an increasing function $f : 2^U \to \mathbb{Z}$ is submodular if and only if it obeys the
improvement independence axiom.

Let $f : 2^U \to \mathbb{Z}$ be a non-decreasing submodular function. By subtracting $f(\emptyset)$ from all values of $f$, we
assume without loss of generality that $f(\emptyset) = 0$. Thus $f(A) \ge 0$ for all $A \subseteq U$. For any two subsets $A, B \subseteq U$,
since $f(A \cap B) \ge 0$, the submodularity of $f$ implies $f(A) + f(B) \ge f(A \cup B)$.
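For small ground sets, these definitions can be checked by brute force. The sketch below is illustrative only: the function names and the coverage-function example are assumptions, and the check enumerates all pairs of subsets, so it is exponential in the size of the ground set.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_nondecreasing_submodular(f, ground):
    """Check f(A) <= f(B) for A subset of B, and f(A) + f(B) >= f(A & B) + f(A | B)."""
    all_sets = list(subsets(ground))
    for A in all_sets:
        for B in all_sets:
            if A <= B and f(A) > f(B):
                return False  # not non-decreasing
            if f(A) + f(B) < f(A & B) + f(A | B):
                return False  # not submodular
    return True

# Example: a coverage function, which is a standard non-decreasing submodular function.
covers = {"x": {1, 2}, "y": {2, 3}, "z": {3}}
coverage = lambda S: len(set().union(*(covers[e] for e in S)))
```

A coverage function passes the check, while a function such as the squared cardinality of S fails the submodularity test.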

We probabilistically embed the given graph into a tree metric, losing an $O(\log n)$ factor in the approximation,
by using the results of Bartal [5] and Fakcharoenphol, Rao and Talwar [14]. There is a one-to-one correspondence
between the original vertices $V$ and the set $L$ of leaves of a single embedding $T$. Using standard
techniques, we also assume, without loss of generality, that we are solving the problem on a rooted tree instance
where the root is required to be included in the output tree. We also assume, without loss of generality, that all
leaves of $T$ are at the same level, i.e., level $h(T)$ where $h(T)$ denotes the height of tree $T$.

The parent of a non-root node $v$ is denoted by $p(v)$. The subtree rooted at a node $v$ is denoted by $T_v$. Let
$e = (u, v)$ be an edge where $u$ is the parent of $v$. The subtree induced by the edge $(u, v)$ is the tree $T_v \cup \{(u, v)\}$,
namely, the tree $T_v$ in addition to the edge $(u, v)$ and the node $u$. We denote the subtree induced by the edge
$(u, v)$ by $T_{(u,v)}$. Let $n$ be the number of nodes in $T$.

The algorithm is recursive. In a general step of the algorithm, we have a tree $\hat{T}$ that is to be included in the
solution and we are computing an augmentation tree $T'$ to satisfy some demand. We abuse the notation and for
a tree $T'$, use $f(T')$ to denote $f(L(T'))$ where $L(T') \subseteq L$ denotes the set of leaves included in $T'$. Similarly,
we use $f_{\hat{T}}(T')$ to denote $f(L(\hat{T} \cup T')) - f(L(\hat{T}))$, i.e., the increase in $f$-value due to the addition of $T'$ to $\hat{T}$.

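Let $\mathrm{den}_{\hat{T}}(T') = c(T') / f_{\hat{T}}(T')$ denote the density, or cost to profit ratio, of the subtree $T'$. Here
$c(T') = \sum_{e \in T'} c_e$ denotes the cost of the tree $T'$. From submodularity of $f$, we get that for a collection of trees $\{T_j\}_{j=1}^{p}$,

$$f_{\hat{T}}\Big(\bigcup_{j=1}^{p} T_j\Big) \le \sum_{j=1}^{p} f_{\hat{T}}(T_j). \qquad (1)$$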



3.2 Intuition 

The algorithm and analysis are more complicated than those of the combinatorial Group Steiner Tree algorithm of
Chekuri et al. [10]. Some of the complications are pointed out below. Some steps we take are:

• Using submodularity of $f$, we show that one can ignore subtrees of the optimum with low $f$-value. The proof
is different (more detailed) than the one in [10].

• We have to guess, given some root $r$, the extent by which the tree rooted at $r$ in OPT increases the current
$f$-value. For efficiency reasons, we cannot check all values, and so we search in powers of roughly $1 + 1/\log n$.
The fact that we do not search on all values creates a problem, the solution of which will become evident when
the algorithm is given.

• We use the fact that we never make recursive calls with a very small increase in $f$-value (because we ignore
"small" trees) and we use geometric search on the amount of increase, to show the polynomial running
time.

• We are not increasing the number of terminals as in [10], but the value of $f$. In [10], as the terminals were
"new", it was clear that those increases are independent. In our case, if we add trees $T_1, T_2, \ldots, T_p$ in this
order, we need to show that the increase in $f$-value is large even after the previous trees were added, namely
we need to show that $\sum_{j=1}^{p} f_{\bigcup_{i=1}^{j-1} T_i}(T_j)$ is large. The proof is more complicated than the one in [10].

• Loosely speaking, we call the algorithm with a target $z$ of increase in $f$-value and a tree of height $h'$. We stop
at a critical point when the increase in $f$-value is at least $z/h'$. The density of this tree is used in the analysis.
This is a point at which the density of the optimum solution has not changed much yet. The original optimum
was a candidate for addition by the algorithm in all previous iterations and its density is not much worse than
the density of the optimum with respect to the empty set. We get a telescopic product that shows that the
density derived is about $O(h')$ times the optimum density.

3.3 Height and degree reductions 

We first recall the height and degree reductions from Chekuri et al. [10].

Claim 3.1 [10] There exists a combinatorial linear-time algorithm that, given an instance of Submodular Tree
Cover on a rooted tree $T$ with $\ell$ leaves and an integer parameter $\alpha$, computes an
instance $T'$ of Submodular Tree Cover such that the height of $T'$ is $O(\log_\alpha \ell)$, for every feasible solution $S$ for $T$
there exists a feasible solution $S'$ for $T'$ with $c(S') \le O(\alpha) \cdot c(S)$, and for every feasible solution $S'$ of $T'$, a
feasible solution $S$ of $T$ can be computed in linear time such that $c(S) \le c(S')$.

Claim 3.2 [10] There exists a combinatorial linear-time algorithm that, given an instance of Submodular
Tree Cover on a rooted tree $T$ with $\ell$ leaves and an integer parameter $\beta \ge 3$, computes a rooted tree $T'$ with
height $h(T) + \lceil \log_{\beta/2} n \rceil$ such that every node has at most $\beta$ children. Moreover, for every feasible solution $S'$
for $T'$, there exists a feasible solution $S$ for $T$ with the same weight, and vice versa.

For some $\epsilon > 0$, we set $\alpha = \beta = \log^\epsilon n$ and assume that the height is at most
$O(\log_{\log^\epsilon n} n) = O(\log n / (\epsilon \log\log n))$, the maximum degree is $O(\log^\epsilon n)$, and the penalty in the
approximation ratio is $O(\log^\epsilon n)$. For a node $v$, let $\deg(v)$ denote the number of children of $v$.

3.4 Ignoring small trees and geometric search 

In a general step of the algorithm, suppose $\hat{T}$ is the tree already included in the solution. To simplify the
notation in the rest of this subsection, we assume that $\hat{T} = \emptyset$ and update the definition of $f$ accordingly, i.e.,
we use $f(T')$ to denote $f_{\hat{T}}(T')$ and $\mathrm{den}(T')$ to denote $\mathrm{den}_{\hat{T}}(T')$. Note that such a change does not affect
non-negativity, monotonicity and submodularity of $f$. Suppose our current target is to find an augmentation
tree $T'$, rooted at some vertex $r$, so that $f(T') \ge z$ where $z$ is the target increase amount. The vertex $r$ and
the augmentation amount $z$ are fixed throughout this subsection. Let $T^*$ be the minimum-cost tree so that
$f(T^*) \ge z$.

For a child $u$ of $r$, let $T^*_{(r,u)} = T^* \cap T_{(r,u)}$ denote the subtree of $T^*$ that hangs from the edge $(r, u)$. We
let $\lambda = 1/h(T)$, where $h(T) = O(\log n / \log\log n)$ is the height of the entire tree $T$. We call $T^*_{(r,u)}$ small if
$f(T^*_{(r,u)}) < \frac{z}{\deg(r) \cdot (1 + 1/\lambda)}$, and big otherwise. Let $\mathcal{F}$ be the forest of small trees. Let
$T^{big} = T^* \setminus \mathcal{F}$. We now show that the density of $T^{big}$ is not much larger than the density of $T^*$.

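Lemma 3.3 (Ignore small trees) $\mathrm{den}(T^{big}) \le (1 + \lambda) \cdot \mathrm{den}(T^*)$.

Proof: Let $T_1, \ldots, T_p$ be the trees in $\mathcal{F}$. Using inequality (1) we get

$$f(T^*) \le f(T^{big}) + \sum_{i=1}^{p} f(T_i) \le f(T^{big}) + \deg(r) \cdot \frac{z}{\deg(r) \cdot (1 + 1/\lambda)} = f(T^{big}) + \frac{z}{1 + 1/\lambda}.$$

Therefore, since $z \le f(T^*)$,

$$f(T^{big}) \ge f(T^*) - \frac{z}{1 + 1/\lambda} \ge f(T^*) \Big(1 - \frac{1}{1 + 1/\lambda}\Big) = \frac{f(T^*)}{1 + \lambda}.$$

Thus, $\mathrm{den}(T^{big}) = c(T^{big}) / f(T^{big}) \le (1 + \lambda) \cdot c(T^*) / f(T^*) = (1 + \lambda) \cdot \mathrm{den}(T^*)$, as desired. ■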

Since the density does not increase significantly, we are safe to ignore small trees. To increase the $f$-value by
$z$, we make several recursive calls with increments of $f$-value that are powers of $1+\lambda$ in the range

$$\left[\frac{z}{\deg(r)\cdot(1+1/\lambda)\cdot(1+\lambda)},\ z\right].$$

Note that the lower end of this range is a factor $1+\lambda$ smaller than the term used in the definition of small trees.
We restrict the search to powers of $1+\lambda$ in order to ensure polynomial running time.
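
To see why restricting to powers of $1+\lambda$ keeps the search small, the following short calculation (a sketch that is not part of the original argument) counts the candidate increments. Here $N$ is a name introduced only for this illustration: the number of powers of $1+\lambda$ in a range whose endpoints differ by the factor $\deg(r)\cdot(1+1/\lambda)\cdot(1+\lambda)$. The last step assumes $0 < \lambda \le 1$, so that $\ln(1+\lambda) \ge \lambda/2$.

```latex
\[
N \;\le\; 1 + \log_{1+\lambda}\bigl(\deg(r)\,(1+1/\lambda)\,(1+\lambda)\bigr)
  \;=\; 1 + \frac{\ln\bigl(\deg(r)\,(1+1/\lambda)\,(1+\lambda)\bigr)}{\ln(1+\lambda)}
  \;\le\; 1 + \frac{2}{\lambda}\,\ln\bigl(\deg(r)\,(1+1/\lambda)\,(1+\lambda)\bigr).
\]
```

With $\lambda = 1/h(T)$, $h(T) = O(\log n/\log\log n)$ and polylogarithmic degree, the logarithm on the right is $O(\log\log n)$, so under these assumptions $N = O(\log n)$ candidate values are tried per child.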

3.5 Greedy augmentation algorithm 

Our algorithm is called Greedy-Augment; see Figure 2. The parameters are the vertex $r$ and a value $z > 0$,
and the goal is to find a tree rooted at $r$ that augments the $f$-value by at least $z$. We add trees one by one. The
union of trees added so far is denoted by $C$. As more trees are incorporated into $C$, $f(C)$ gets larger and larger.
The output of Greedy-Augment, however, may not end up augmenting the $f$-value by at least $z$. If the height
$h(T_r)$ of the tree $T_r$ rooted at $r$ is 1, we output a single edge. Otherwise, we make recursive calls and keep
augmenting $C$ until $f(C)$ is at least $z$. Let $C_h$ be the value of $C$ when we have $f(C) \ge z/h(T_r)$ for the
first time. We eventually output the tree of better density among $C$ and $C_h$.
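
The control flow just described can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the value oracle is taken to be a coverage function (`covers` maps a vertex to the set of elements it covers), a tree is represented by the set of its non-root vertices, the geometric grid is computed from the residual target, and the names `make_greedy_augment`, `augment`, `grid` and the toy instance are all hypothetical. The cost of a candidate counts every edge in it, even one already bought in an earlier round.

```python
import math

def make_greedy_augment(children, cost, covers, lam):
    """Build a Greedy-Augment routine for a rooted tree and a coverage oracle.

    A vertex v != root stands for the edge (parent(v), v); trees are vertex sets.
    """
    def f(S):                                   # value oracle: number of covered elements
        return len(set().union(*(covers.get(v, set()) for v in S)))

    def gain(S, base):                          # marginal value f_base(S)
        return f(S | base) - f(base)

    def density(S, base):                       # den_base(S) = c(S) / f_base(S)
        g = gain(S, base)
        return sum(cost[v] for v in S) / g if g > 0 else math.inf

    def height(v):
        kids = children.get(v, [])
        return 1 + max(height(u) for u in kids) if kids else 0

    def grid(Z, deg):                           # powers of (1 + lam) in the search range
        low = Z / (deg * (1 + 1 / lam) * (1 + lam))
        k = math.ceil(math.log(low, 1 + lam) - 1e-9)
        out = []
        while (1 + lam) ** k <= Z * (1 + 1e-9):
            out.append((1 + lam) ** k)
            k += 1
        return out

    def augment(r, z, base=frozenset()):
        kids, h = children.get(r, []), height(r)
        if h == 0:                              # leaf: nothing below r to buy
            return frozenset()
        if h == 1:                              # base case: one minimum-density edge
            return frozenset({min(kids, key=lambda u: density({u}, base))})
        C, C_h, Z = frozenset(), None, z
        while Z > 0:
            have = base | C
            cands = [augment(u, zp, have) | {u}
                     for u in kids for zp in grid(Z, len(kids))]
            T_aug = min(cands, key=lambda S: density(S, have))
            g = gain(T_aug, have)
            if g <= 0:                          # no candidate adds value: stop early
                break
            C, Z = C | T_aug, Z - g
            if C_h is None and gain(C, base) >= z / h:
                C_h = C                         # first time f(C) >= z / h(T_r)
        return min([C, C_h or C], key=lambda S: density(S, base))

    return augment

# Toy instance (hypothetical): root r, height 2, so lam = 1 / h(T) = 0.5.
children = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
cost = {"a": 1, "b": 1, "a1": 1, "a2": 1, "b1": 10}
covers = {"a1": {1, 2}, "a2": {3}, "b1": {1, 2, 3, 4}}
tree = make_greedy_augment(children, cost, covers, lam=0.5)("r", 3)
```

On this instance the cheap branch below `a` is preferred over the expensive leaf `b1`, even though `b1` alone would meet the target. The recursion is exponential in the height, which is why the paper first reduces the height and degree of the tree.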

The following lemma bounds the running time of the algorithm Greedy-Augment. 

Lemma 3.4 [10] Let $\Delta$ be the maximum degree of the tree $T_r$ and let $\beta = \Delta(1+1/\lambda)(1+\lambda)$. The algorithm
Greedy-Augment$(r, z)$ takes $O(n\alpha^{h(T_r)})$ time and calls to the value oracle for $f$. Here $\alpha = \beta\cdot h(T_r)\cdot
\log z\cdot\Delta\cdot\log_{1+\lambda}\beta$. If $h(T_r) = O(\log n/\log\log n)$, $\Delta = O(\log n)$, $1 < 1/\lambda = O(\log n)$, $z$ is polynomially
bounded in $n$, and the value oracle for $f$ takes time polynomial in $n$, then the overall running time is polynomial
in $n$.

The proof of this lemma is similar to that in [10] and is omitted. The value oracle for the submodular functions $f$ needed
for the applications in Section 1.3 can be reduced to max-flow or min-cut algorithms.
We next prove the approximation guarantees of this algorithm.

Lemma 3.5 The output $T_{out}$ of Greedy-Augment satisfies

$$\mathrm{den}(T_{out}) \le (1+\lambda)^{2h(T_r)}\cdot h(T_r)\cdot\mathrm{den}(T^*).$$



Algorithm Greedy-Augment$(r, z)$:

1. Initialize: $C \leftarrow \emptyset$, $Z \leftarrow z$, and $T_{res} \leftarrow T_r$.

2. Base case: If $h(T_r) = 1$, return the edge $(r, u)$ where $u$ is a child of $r$ with minimum density
   $\mathrm{den}((r, u)) = c((r,u))/f(u)$.

3. While $Z > 0$ do:

   (a) Recurse: For every child $u$ of $r$ and for every $z'$ that is a power of $(1+\lambda)$ in
       $\left[\frac{Z}{\deg(r)\cdot(1+1/\lambda)\cdot(1+\lambda)},\ Z\right]$, let $C_{u,z'} \leftarrow$ Greedy-Augment$(u, z')$.

   (b) Select: Let $T_{aug} \leftarrow \arg\min \mathrm{den}(C_{u,z'} \cup \{(r, u)\})$ be the minimum density tree among those computed.

   (c) Update:

       i. $C \leftarrow C \cup T_{aug}$.

       ii. Update $Z$ as: $Z \leftarrow Z - f(T_{aug})$.

       iii. Update the function $f$ as: let $f(T')$ denote $f_{T \cup C}(T')$.

       iv. If it is the first time that $f(C) \ge z/h(T_r)$, then $C_h \leftarrow C$.

4. Return the lower density tree among $C$ and $C_h$.

Figure 2: The Greedy-Augment algorithm for submodular tree cover

3.5.1 Proof of Lemma 3.5

The rest of this subsection is devoted to proving Lemma 3.5. The proof is by induction on the height of $T_r$.
For the base case, $h(T_r) = 1$, we note that the optimum augmentation tree $T^*$ is a star and the output consists of a
single edge $(r, u^*)$. By submodularity of $f$, we have

$$\mathrm{den}(T^*) = \frac{c(T^*)}{f(T^*)} \ge \frac{\sum_{(r,u)\in E(T^*)} c((r,u))}{\sum_{(r,u)\in E(T^*)} f(u)} \ge \min_{(r,u)\in E(T^*)} \frac{c((r,u))}{f(u)} \ge \frac{c((r,u^*))}{f(u^*)} = \mathrm{den}((r,u^*)).$$

Now we prove the induction step. The proof here is different from the one in [10]. Recall that $T^{big}$ is the union
of the big trees in $T^*$. Decompose $T^{big}$ into the trees $T^*_{(r,u_1)} \cup T^*_{(r,u_2)} \cup \cdots \cup T^*_{(r,u_d)}$. Here the tree $T^*_{(r,u_i)}$ is a tree $T^*_{u_i}$
rooted at a child $u_i$ of $r$ plus the edge $(r, u_i)$. Say that tree number $i$ is rooted at $u_i$. Let $z_i^* = f(T^*_{(r,u_i)})$. By a
simple averaging argument, it follows that the density of at least one of the big subtrees is at most $\mathrm{den}(T^{big})$.
Without loss of generality, assume that

$$\mathrm{den}(T^*_{(r,u_1)}) \le \mathrm{den}(T^{big}) \le (1+\lambda)\cdot\mathrm{den}(T^*). \qquad (2)$$

Let $\tilde{z}_1 = (1+\lambda)^{i}$ be such that $\tilde{z}_1 \le z_1^* < (1+\lambda)\cdot\tilde{z}_1$. Note that $\tilde{z}_1$ is in the range of powers of $(1+\lambda)$
considered in line 3a of Algorithm Greedy-Augment. Here we see why the least value of the search interval
needs to be the term used to define small trees divided by $1+\lambda$. We upper bound the density by considering
a very specific recursive call. Then we can bound the density by the induction hypothesis. Consider the call
$C_{u_1,\tilde{z}_1} \leftarrow$ Greedy-Augment$(u_1, \tilde{z}_1)$ in line 3a of Greedy-Augment. We now upper bound the density
for that call. The tree $C_{u_1,\tilde{z}_1}$ is incrementally constructed from a sequence of augmenting trees, denoted by
$\{R_1, R_2, \ldots\}$. Let $j$ denote the smallest integer such that

$$f\Bigl(\bigcup_{i=1}^{j} R_i\Bigr) \ge \frac{\tilde{z}_1}{h(T_r)}.$$

Note that when computing $R_j$, the $f$-value of the union is less than $\tilde{z}_1/h(T_r)$. During all the iterations of
the while loop in which $C_{u_1,\tilde{z}_1}$ is computed, the subtree $T^*_{u_1}$ is a feasible solution for the required increase in
$f$-value to $\tilde{z}_1$. Thus by definition, for $p \le j-1$,

$$f\Bigl(\bigcup_{i=1}^{p} R_i\Bigr) < \frac{\tilde{z}_1}{h(T_r)}.$$

Since $f$ is non-decreasing and submodular,

$$\tilde{z}_1 \le z_1^* = f(T^*_{u_1}) \le f\Bigl(\bigcup_{i=1}^{p} R_i \cup T^*_{u_1}\Bigr) \le f\Bigl(\bigcup_{i=1}^{p} R_i\Bigr) + f_{\bigcup_{i=1}^{p} R_i}(T^*_{u_1}).$$

Plugging in the previous inequality we get

$$f_{\bigcup_{i=1}^{p} R_i}(T^*_{u_1}) \ge \tilde{z}_1 - \frac{\tilde{z}_1}{h(T_r)}.$$

Thus

$$\mathrm{den}_{\bigcup_{i=1}^{p} R_i}(T^*_{u_1}) \le \frac{c(T^*_{u_1})}{\tilde{z}_1 - \tilde{z}_1/h(T_r)}.$$

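The following establishes an upper bound on $\mathrm{den}_{\bigcup_{i=1}^{p} R_i}(R_{p+1})$.

Claim 3.6 For all $p \le j-1$,

$$\mathrm{den}_{\bigcup_{i=1}^{p} R_i}(R_{p+1}) \le (1+\lambda)^{2h(T_r)-2}\cdot h(T_r)\cdot\frac{c(T^*_{u_1})}{\tilde{z}_1}.$$

Proof: By the bound $\mathrm{den}_{\bigcup_{i=1}^{p} R_i}(T^*_{u_1}) \le c(T^*_{u_1})/(\tilde{z}_1 - \tilde{z}_1/h(T_r))$ and the induction hypothesis we get:

$$\begin{aligned}
\mathrm{den}_{\bigcup_{i=1}^{p} R_i}(R_{p+1}) &\le (1+\lambda)^{2h(T_{u_1})}\cdot h(T_{u_1})\cdot\frac{c(T^*_{u_1})}{\tilde{z}_1 - \tilde{z}_1/h(T_r)} \\
&\le (1+\lambda)^{2h(T_r)-2}\cdot(h(T_r)-1)\cdot\frac{c(T^*_{u_1})}{\tilde{z}_1 - \tilde{z}_1/h(T_r)} \\
&= (1+\lambda)^{2h(T_r)-2}\cdot h(T_r)\cdot\frac{c(T^*_{u_1})}{\tilde{z}_1}.
\end{aligned}$$ ■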

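Here is a claim that is needed only because $f$ is submodular.

Claim 3.7

$$\mathrm{den}(C_h) \le (1+\lambda)^{2h(T_r)-2}\cdot h(T_r)\cdot\frac{c(T^*_{u_1})}{\tilde{z}_1}.$$

Proof: Note that $C_h = R_1 \cup R_2 \cup \cdots \cup R_j$. Now

$$\mathrm{den}(C_h) = \frac{c(C_h)}{f(C_h)} \le \frac{\sum_{p=1}^{j} c(R_p)}{\sum_{p=1}^{j} f_{\bigcup_{i=1}^{p-1} R_i}(R_p)} \le \max_{1\le p\le j} \mathrm{den}_{\bigcup_{i=1}^{p-1} R_i}(R_p) \le (1+\lambda)^{2h(T_r)-2}\cdot h(T_r)\cdot\frac{c(T^*_{u_1})}{\tilde{z}_1}.$$ ■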
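Finally, we bound the density of $T_{out}$.

$$\begin{aligned}
\mathrm{den}(T_{out}) &\le \frac{c\bigl(\bigcup_{i\le j} R_i\bigr) + c((r,u_1))}{f\bigl(\bigcup_{i\le j} R_i\bigr)} \\
&\le (1+\lambda)^{2h(T_r)-2}\cdot h(T_r)\cdot\frac{c(T^*_{u_1})}{\tilde{z}_1} + \frac{c((r,u_1))}{\tilde{z}_1/h(T_r)} && \text{(by Claim 3.7)} \\
&\le (1+\lambda)^{2h(T_r)-2}\cdot h(T_r)\cdot\left(\frac{c(T^*_{u_1})}{z_1^*/(1+\lambda)} + \frac{c((r,u_1))}{z_1^*/(1+\lambda)}\right) && \text{(by def. of } \tilde{z}_1\text{)} \\
&= (1+\lambda)^{2h(T_r)-1}\cdot h(T_r)\cdot\frac{c(T^*_{u_1}) + c((r,u_1))}{z_1^*} \\
&= (1+\lambda)^{2h(T_r)-1}\cdot h(T_r)\cdot\mathrm{den}(T^*_{(r,u_1)}) && \text{(by definition)} \\
&\le (1+\lambda)^{2h(T_r)}\cdot h(T_r)\cdot\mathrm{den}(T^*) && \text{(by Eq. (2))}
\end{aligned}$$

This proves Lemma 3.5.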



3.6 Putting things together 

We run Greedy-Augment iteratively until we obtain a tree with the maximum value $F_{max}$ of $f$. By a simple
set-cover like argument, the overall running time is polynomial in $n$ times $\log F_{max}$ and the overall approximation
ratio for the tree-instances $T$ is $O(\log^{\epsilon} n\cdot h(T)\cdot\log F_{max}) = O(\log^{1+\epsilon} n\cdot\log F_{max})$. The first $\log^{\epsilon} n$
factor comes from the height and degree reductions. For the graph instances, we get an $O(\log^{2+\epsilon} n\cdot\log F_{max})$-approximation,
where another $\log n$ term comes from approximating the general metric by tree metrics.



4 Approximating GEN-P2P Connection (Theorem 1.4)
4.1 A 2-approximation algorithm for the case $b(V) = 0$

Our 2-approximation algorithm is an easy extension of the algorithm of [19, 18] for the Point-to-Point Connection
problem, which is the case $b_v \in \{-1, 0, 1\}$. We say that an edge $e$ covers a set $S$ if $e$ has exactly one
endnode in $S$; we say that an edge-set/graph $H$ covers a set family $\mathcal{F}$ if for every $S \in \mathcal{F}$ there is an edge in $H$
covering $S$. Given a set-family $\mathcal{F}$ and an edge-set $H$, the residual set-family $\mathcal{F}_H$ consists of the members of $\mathcal{F}$
not covered by $H$. Recall that a set-family $\mathcal{F}$ is uncrossable if for any $X, Y \in \mathcal{F}$ at least one of the following
holds: $X \cap Y, X \cup Y \in \mathcal{F}$ or $X \setminus Y, Y \setminus X \in \mathcal{F}$. It is known and easy to see that if $\mathcal{F}$ is uncrossable, so is $\mathcal{F}_H$,
for any edge-set $H$.

Goemans et al. [18] give a primal-dual 2-approximation algorithm for the problem of finding a minimum-cost
edge-cover of an uncrossable set-family $\mathcal{F}$. A polynomial time implementation of this algorithm requires
only that for any edge-set $H$, the minimal members of the residual set-family $\mathcal{F}_H$ can be computed in polynomial
time (but $\mathcal{F}$ itself may not be given explicitly). Now the 2-approximation algorithm follows from the
following lemma.

Lemma 4.1 Given an instance of Unbalanced-P2P with $b(V) = 0$, let $\mathcal{F} = \{S \subseteq V : b(S) \ne 0\}$. Then the
following holds.

(i) An edge-set $H \subseteq E$ is a feasible solution to Unbalanced-P2P if and only if $H$ covers $\mathcal{F}$.

(ii) For any edge set $H \subseteq E$, $S$ is an inclusion-minimal member of $\mathcal{F}_H$ if and only if $S$ is a connected
component of the graph $(V, H)$ and $b(S) \ne 0$.

(iii) $\mathcal{F}$ is uncrossable.

Proof: Parts (i) and (ii) are straightforward, so we prove only part (iii). Let $X, Y \in \mathcal{F}$, so $b(X), b(Y) \ne 0$. We
will show that if $X \cap Y \notin \mathcal{F}$ or if $X \cup Y \notin \mathcal{F}$, then $X \setminus Y, Y \setminus X \in \mathcal{F}$. Suppose that $X \cap Y \notin \mathcal{F}$, so $b(X \cap Y) = 0$.
Then $b(X \setminus Y) = b(X) - b(X \cap Y) = b(X) \ne 0$ and $b(Y \setminus X) = b(Y) - b(Y \cap X) = b(Y) \ne 0$; hence
$X \setminus Y, Y \setminus X \in \mathcal{F}$. Suppose that $X \cup Y \notin \mathcal{F}$, so $b(X \cup Y) = 0$. Then $b(X \setminus Y) = b(X \cup Y) - b(Y) = -b(Y) \ne 0$
and $b(Y \setminus X) = b(X \cup Y) - b(X) = -b(X) \ne 0$; hence $X \setminus Y, Y \setminus X \in \mathcal{F}$. ■
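
The lemma can be checked by brute force on a toy instance. The Python sketch below is illustrative only: it enumerates all subsets, so it is exponential and suitable only for very small vertex sets. The function names and the charges in `b` are hypothetical choices made for this example.

```python
from itertools import combinations

def charge(S, b):
    return sum(b[v] for v in S)

def nonzero_family(b):
    """All non-empty subsets S of V with b(S) != 0, as frozensets (toy sizes only)."""
    V = list(b)
    return {frozenset(S) for k in range(1, len(V) + 1)
            for S in combinations(V, k) if charge(S, b) != 0}

def is_uncrossable(F):
    """Check the uncrossability condition for every pair X, Y in F."""
    return all(((X & Y) in F and (X | Y) in F) or ((X - Y) in F and (Y - X) in F)
               for X in F for Y in F)

def components(V, H):
    """Connected components of the graph (V, H), using a small union-find."""
    parent = {v: v for v in V}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, w in H:
        parent[find(u)] = find(w)
    comps = {}
    for v in V:
        comps.setdefault(find(v), set()).add(v)
    return [frozenset(c) for c in comps.values()]

def minimal_uncovered(b, H):
    """Part (ii): the minimal members of F_H are the components with non-zero charge."""
    return {C for C in components(b, H) if charge(C, b) != 0}

b = {"p": 2, "q": -1, "r": -1, "s": 0}   # hypothetical charges with b(V) = 0
F = nonzero_family(b)
H = [("p", "q")]
```

Here `minimal_uncovered(b, H)` returns the components `{p, q}` and `{r}`, which is exactly the information the primal-dual algorithm needs in each round, without ever listing the family explicitly.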

4.2 An exact algorithm for trees 

We now focus on the case when the charges $b_v$ are polynomially bounded, but the total charge $b(V)$ may not be
zero. We show how to solve the problem on trees optimally, using dynamic programming.

Root the tree $T$ at some node $s$. By adding zero-cost edges to $T$ if necessary, we can assume without loss of
generality that $T$ is a binary tree. In particular, if a node $v$ has $p$ children, we add a binary tree with $p$
leaves at $v$ and connect these $p$ leaves one-to-one to the $p$ children. We give a cost of zero to each edge of the
added binary tree, while the edge into each child keeps its original cost. It
is easy to see that the instance essentially remains unchanged by this modification. For a node $v \in T$, let $T_v$
denote the subtree hanging below $v$. The dynamic program computes quantities $T(v, B)$ for all nodes $v \in T$
and integers $B$ in the range $\bigl[\sum_{u : b_u < 0} b_u,\ \sum_{u : b_u > 0} b_u\bigr]$. Since each $b_u$ is polynomially bounded, the number of
such quantities is polynomial. The quantity $T(v, B)$ is defined as the minimum cost of a subgraph $H$ of $T_v$
satisfying the following:

• the connected component in H containing v has the total charge B, and 

• every other connected component in H has non-negative total charge. 

If there is no subgraph $H$ satisfying the above conditions, we define $T(v, B)$ as $\infty$. We assume that the
minimum-cost subgraph $H$ is also stored in the dynamic program table.

The quantities $T(v, B)$ can be computed as follows. For leaf nodes $v$, it is trivial to compute $T(v, B)$ and
the corresponding optimum subgraphs. For an internal node $v$, we compute $T(v, B)$ as follows. Let $u_1$ and $u_2$
be the two children of $v$. Depending on whether we pick edges $(v, u_1)$ or $(v, u_2)$ in the solution, we get four
possibilities.

1. If we pick none of these edges in the solution, we get a solution of cost $\min\{T(u_1, B_1) + T(u_2, B_2) \mid
B_1, B_2 \ge 0\}$, corresponding to a charge of $b_v$ for the connected component containing $v$.

2. If we pick edge $(v, u_1)$ but do not pick edge $(v, u_2)$ in the solution, we get a solution of cost $\min\{c_{(v,u_1)} +
T(u_1, B_1) + T(u_2, B_2) \mid B_2 \ge 0\}$, corresponding to a charge of $b_v + B_1$ for the connected component
containing $v$.

3. If we pick edge $(v, u_2)$ but do not pick edge $(v, u_1)$ in the solution, we get a solution of cost $\min\{c_{(v,u_2)} +
T(u_2, B_2) + T(u_1, B_1) \mid B_1 \ge 0\}$, corresponding to a charge of $b_v + B_2$ for the connected component
containing $v$.

4. If we pick both the edges $(v, u_1)$ and $(v, u_2)$ in the solution, we get a solution of cost $c_{(v,u_1)} +
T(u_1, B_1) + c_{(v,u_2)} + T(u_2, B_2)$, corresponding to a charge of $b_v + B_1 + B_2$ for the connected component
containing $v$.

We consider all these possibilities and pick the minimum-cost solution corresponding to each value of the 
charge of the connected component containing v. 

Finally, we output the solution corresponding to $\min\{T(s, B) \mid B \ge 0\}$. It is easy to see that the above
dynamic programming based algorithm computes the optimum solution to our problem.
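
The dynamic program can be sketched in Python as follows. This is a hedged illustration rather than the paper's exact procedure: instead of first making the tree binary, it merges the children of a node one at a time, which considers the same skip/buy choices per child edge. It returns only the optimum cost, not the subgraph. The function name and the input encoding (`children`, `cost`, `charge`) are assumptions made for the example.

```python
import math

def min_cost_unbalanced_p2p_tree(children, cost, charge, root):
    """Minimum cost of a forest in which every component has total charge >= 0.

    children: dict node -> list of child nodes
    cost:     dict (parent, child) -> edge cost
    charge:   dict node -> integer charge b_v
    """
    def solve(v):
        # table[B] = min cost inside T_v such that the component of v has charge B
        # and every other component inside T_v has non-negative charge.
        table = {charge[v]: 0}
        for u in children.get(v, []):
            sub = solve(u)
            # Cheapest way to leave u's component on its own (it needs charge >= 0).
            closed = min((c for B, c in sub.items() if B >= 0), default=math.inf)
            merged = {}
            for B, c in table.items():
                if closed < math.inf:              # skip the edge (v, u)
                    merged[B] = min(merged.get(B, math.inf), c + closed)
                for Bu, cu in sub.items():         # buy the edge (v, u)
                    total = c + cu + cost[(v, u)]
                    merged[B + Bu] = min(merged.get(B + Bu, math.inf), total)
            table = merged
        return table

    # The root component must also end up with non-negative charge.
    return min((c for B, c in solve(root).items() if B >= 0), default=math.inf)

# A path a - b - c where the negative node c must be joined to the positive node a.
path_cost = min_cost_unbalanced_p2p_tree(
    {"a": ["b"], "b": ["c"]}, {("a", "b"): 2, ("b", "c"): 3},
    {"a": 1, "b": 0, "c": -1}, "a")
```

Each table has at most one entry per achievable charge, so with polynomially bounded charges the merge at each child takes polynomial time, matching the counting argument in the text.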
4.3 An $O(\log n')$-approximation algorithm where $n' = |V^+ \cup V^-|$

The algorithm is as follows. We reduce the general problem to the case when the input graph is a tree, with a loss
of an $O(\log n')$ factor in the approximation ratio. This is achieved as follows. Consider the shortest-path metric
on $V' = V^+ \cup V^-$ w.r.t. the edge-costs $c_e$. We probabilistically embed this metric into a tree metric $(T, d)$ with
$O(\log n')$ distortion using the results of Bartal [2] and Fakcharoenphol, Rao and Talwar [14]. There is a
one-to-one correspondence between $V'$ and the set $L$ of leaves of $T$. The resulting instance of Unbalanced-P2P on $T$
inherits the charges on the leaves of $T$ from the original charges on nodes of $V'$, while the charge of internal
nodes of $T$ is 0. We compute an optimal solution to the obtained tree instance, and return the corresponding
subgraph $H$ of $G$. Note that any feasible solution with cost $C$ for the original instance induces a solution
with cost $O(C \log n')$ for the new instance on tree $T$. Similarly, any feasible solution with cost $C$ for the new
instance induces a solution with cost at most $C$ for the original instance. Hence the approximation ratio is bounded by
the distortion of the reduction, which is $O(\log n')$.

Now consider the augmentation version of the problem, when we are given an edge subset $E' \subseteq E$ of cost 0.
Then we can contract every connected component $F$ of $(V, E')$ into a single node $v_F$ with charge $b(v_F) = b(F)$.
Thus the approximation ratio in this case is $O(\log n')$, where here $n'$ is the number of connected components
with non-zero charge in the graph $(V, E')$.
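
The contraction step is simple to make concrete. The following Python sketch, with hypothetical names and a toy input, contracts the components of $(V, E')$ with a union-find, sums the charges inside each component, keeps the cheapest edge between any two contracted nodes, and reports the number of contracted nodes with non-zero charge.

```python
def contract_zero_cost_components(charge, zero_edges, edges):
    """Contract each component of (V, E') into one node carrying the summed charge.

    charge:     dict node -> charge b_v
    zero_edges: iterable of (u, w) pairs forming E'
    edges:      dict (u, w) -> cost for the remaining edges
    """
    parent = {v: v for v in charge}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in zero_edges:
        parent[find(u)] = find(w)

    new_charge = {}
    for v, b in charge.items():
        rep = find(v)
        new_charge[rep] = new_charge.get(rep, 0) + b

    new_edges = {}
    for (u, w), c in edges.items():
        a, d = find(u), find(w)
        if a != d:                                  # drop edges inside a component
            key = (min(a, d), max(a, d))
            new_edges[key] = min(c, new_edges.get(key, c))

    n_prime = sum(1 for b in new_charge.values() if b != 0)
    return new_charge, new_edges, n_prime

charge = {"a": 2, "b": -2, "c": 1, "d": -1, "e": 0}
new_charge, new_edges, n_prime = contract_zero_cost_components(
    charge, [("a", "b")],
    {("b", "c"): 3, ("a", "c"): 5, ("c", "d"): 1, ("d", "e"): 2})
```

In this example the component `{a, b}` has total charge 0, so only the nodes `c` and `d` count towards $n'$.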



4.4 An $O(\log(2 + b(V)))$-approximation algorithm

Note that $b(V)$ may be very small (even close to 0).

Lemma 4.2 There exists a polynomial time algorithm that, given an instance of Unbalanced-P2P, computes an
edge set $E' \subseteq E$ of cost at most $4\tau^*$, where $\tau^*$ denotes the optimal solution value, such that the number $n'$ of
connected components with non-zero charge in the graph $(V, E')$ is at most $4b(V)$.

Proof: Consider the following procedure that runs with a parameter $\tau$, which is an estimate for $\tau^*$. Create an
instance of Unbalanced-P2P with total charge zero by adding a new node $s$ with charge $-b(V)$ and connecting
$s$ to each node in $V^+$ by an edge of cost $\tau/b(V)$. Then apply the 2-approximation algorithm for the case
$b(V) = 0$. The new instance admits a solution of cost at most $\tau^* + b(V)\cdot(\tau/b(V)) = \tau^* + \tau$, by taking
an optimal solution to the original instance together with edges that connect $s$ to at most $b(V)$ nodes in $V^+$. Thus the
procedure returns an edge-set of cost at most $2(\tau^* + \tau)$. Consequently, if $\tau \ge \tau^*$ then the procedure returns
an edge-set of cost at most $4\tau$, and the number of edges incident to $s$ is at most $4\tau/(\tau/b(V)) = 4b(V)$. Using
binary search, we find the minimum integer $\tau$ for which the procedure returns an edge-set $E''$ of cost at most $4\tau$. Then
$c(E'') \le 4\tau \le 4\tau^*$ and the number of edges in $E''$ incident to $s$ is at most $4b(V)$. Let $E'$ be obtained from $E''$
by removing the edges incident to $s$. Then $c(E') \le c(E'') \le 4\tau^*$, and the number $n'$ of connected components
in $(V, E')$ with non-zero charge is at most the degree of $s$ w.r.t. $E''$, hence at most $4b(V)$, as claimed. ■
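
The instance transformation used in this proof can be sketched as below. This covers only the construction of the balanced instance and the removal of the edges at $s$ afterwards. The 2-approximation itself is not implemented, and the function names, the node name `"s"` and the sample data are hypothetical. `Fraction` is used so that the edge cost $\tau/b(V)$ stays exact.

```python
from fractions import Fraction

def add_balancing_node(charge, edges, tau, s="s"):
    """Add a node s with charge -b(V), joined to every positive node at cost tau / b(V)."""
    total = sum(charge.values())                 # b(V), assumed positive here
    if total <= 0 or s in charge:
        raise ValueError("need b(V) > 0 and a fresh name for s")
    new_charge = {**charge, s: -total}
    new_edges = dict(edges)
    for v, b in charge.items():
        if b > 0:                                # v belongs to V+
            new_edges[(s, v)] = Fraction(tau, total)
    return new_charge, new_edges

def strip_balancing_node(solution_edges, s="s"):
    """Split an edge set E'' into E' (edges avoiding s) and the degree of s in E''."""
    kept = [e for e in solution_edges if s not in e]
    return kept, len(solution_edges) - len(kept)

charge = {"x": 3, "y": -1, "z": 1}
edges = {("x", "y"): 4, ("y", "z"): 2}
new_charge, new_edges = add_balancing_node(charge, edges, tau=6)
```

Since every edge at $s$ costs $\tau/b(V)$, an edge set of cost at most $4\tau$ can use at most $4b(V)$ of them, which is the degree bound used in the proof.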



The entire algorithm has two steps. In step 1 we compute an edge set $E'$ as in Lemma 4.2. Step 2 applies the
$O(\log n')$-approximation algorithm from the previous section to compute an augmenting edge-set $F \subseteq E \setminus E'$
such that $E' \cup F$ is a feasible solution. The solution cost is bounded by $c(E') + c(F) = O(\tau^*) + O(\log n')\cdot\tau^* =
O(\log(2 + b(V)))\cdot\tau^*$.



5 Conclusions 

We present hardness results and combinatorial algorithms for several special cases of Cap-SNDP. Naturally,
obtaining a poly-logarithmic approximation algorithm for Cap-SNDP is a wide open question.

It is also open whether one can achieve a constant ratio for Unbalanced-P2P. If so, it will be a single algorithm
that gives a constant ratio for both Steiner Forest and k-Steiner Tree (or k-MST). Currently, constant ratio
algorithms for these two problems use quite different techniques. Thus a constant approximation algorithm for
Unbalanced-P2P, if possible, will unify techniques for both problems.